
273 research outputs found

Recommended from our members

Improving the Power of GWAS and Avoiding Confounding from Population Stratification with PC-Select

Authors: Berger Bonnie; Price Alkes L.; Tucker George
Publication venue: Genetics Society of America
Publication date: 13/08/2014

Using a reduced subset of SNPs in a linear mixed model can improve power for genome-wide association studies, yet this can result in insufficient correction for population stratification. We propose a hybrid approach using principal components that does not inflate statistics in the presence of population stratification and improves power over standard linear mixed models.

Harvard University - DASH

Application of Ancestry Informative Markers to Association Studies in European Americans

Authors: Price Alkes L; Seldin Michael F
Publication venue: Public Library of Science
Publication date: 01/01/2008

Directory of Open Access Journals

PubMed Central

Population Structure and Eigenanalysis

Authors: Patterson Nick; Price Alkes L; Reich David
Publication venue: Public Library of Science
Publication date: 01/01/2006

Current methods for inferring population structure from genetic data do not provide formal significance tests for population differentiation. We discuss an approach to studying population structure (principal components analysis) that was first applied to genetic data by Cavalli-Sforza and colleagues. We place the method on a solid statistical footing, using results from modern statistics to develop formal significance tests. We also uncover a general “phase change” phenomenon about the ability to detect structure in genetic data, which emerges from the statistical theory we use, and has an important implication for the ability to discover structure in genetic data: for a fixed but large dataset size, divergence between two populations (as measured, for example, by a statistic like F_ST) below a threshold is essentially undetectable, but a little above threshold, detection will be easy. This means that we can predict the dataset size needed to detect structure.
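The "phase change" described in this abstract can be made concrete with a small calculation. The abstract itself does not state the threshold; the sketch below assumes the form reported in the published paper for two equal-sized populations, roughly F_ST = 1/sqrt(n * m) for n individuals and m markers. The function names are hypothetical and the inputs are illustrative, not values from the study.

```python
import math


def detection_threshold(n_individuals: int, n_markers: int) -> float:
    """Approximate F_ST below which structure is essentially undetectable.

    Assumption (not stated in the abstract): threshold = 1 / sqrt(n * m).
    """
    return 1.0 / math.sqrt(n_individuals * n_markers)


def required_data_size(fst: float) -> float:
    """Product n * m at which the assumed threshold equals the given F_ST."""
    return 1.0 / fst ** 2


# Illustrative inputs only: 100 individuals typed at 10,000 markers.
threshold = detection_threshold(100, 10_000)

# Dataset size (individuals times markers) needed to reach F_ST = 0.01.
needed = required_data_size(0.01)

print(f"threshold F_ST: {threshold:.4g}")
print(f"required n * m: {needed:.4g}")
```

Under this assumption the threshold depends only on the product of sample size and marker count, so adding markers can substitute for adding individuals when predicting whether a given divergence is detectable.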

CiteSeerX

Public Library of Science (PLOS)

Harvard University - DASH

Directory of Open Access Journals

PubMed Central


Identifying Repeat Domains in Large Genomes

Authors: Pevzner Pavel A; Price Alkes; Raphael Benjamin J; Tang Haixu; Zhi Degui
Publication venue: Springer Science and Business Media LLC
Publication date: 18/11/2010

We present a graph-based method for the analysis of repeat families in a repeat library. We build a repeat domain graph that decomposes a repeat library into repeat domains, short subsequences shared by multiple repeat families, and reveals the mosaic structure of repeat families. Our method recovers documented mosaic repeat structures and suggests additional putative ones. Our method is useful for elucidating the evolutionary history of repeats and annotating de novo generated repeat libraries.

Harvard University - DASH

Explicit Modeling of Ancestry Improves Polygenic Risk Scores and BLUP Prediction

Authors: Chen Chia-Yen; Han Jiali; Hunter David J.; Kraft Peter; Price Alkes L.
Publication venue: Wiley
Publication date: 01/09/2015

Polygenic prediction using genome-wide SNPs can provide high prediction accuracy for complex traits. Here, we investigate the question of how to account for genetic ancestry when conducting polygenic prediction. We show that the accuracy of polygenic prediction in structured populations may be partly due to genetic ancestry. However, we hypothesized that explicitly modeling ancestry could improve polygenic prediction accuracy. We analyzed three GWAS of hair color (HC), tanning ability (TA), and basal cell carcinoma (BCC) in European Americans (sample size from 7,440 to 9,822) and considered two widely used polygenic prediction approaches: polygenic risk scores (PRSs) and best linear unbiased prediction (BLUP). We compared polygenic prediction without correction for ancestry to polygenic prediction with ancestry as a separate component in the model. In 10-fold cross-validation using the PRS approach, the R² for HC increased by 66% (0.0456 to 0.0755; P < 10⁻¹⁶), the R² for TA increased by 123% (0.0154 to 0.0344; P < 10⁻¹⁶), and the liability-scale R² for BCC increased by 68% (0.0138 to 0.0232; P < 10⁻¹⁶) when explicitly modeling ancestry, which prevents ancestry effects from entering into each SNP effect and being overweighted. Surprisingly, explicitly modeling ancestry produces a similar improvement when using the BLUP approach, which fits all SNPs simultaneously in a single variance component and causes ancestry to be underweighted. We validate our findings via simulations, which show that the differences in prediction accuracy will increase in magnitude as sample sizes increase. In summary, our results show that explicitly modeling ancestry can be important in both PRS and BLUP prediction.
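The percentage gains quoted in this abstract are relative increases in R-squared, and they follow from the before and after values given in parentheses. The short check below reproduces that arithmetic; the function and variable names are hypothetical, and the numbers are copied from the abstract.

```python
def relative_increase(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (without ancestry modeling, with ancestry modeling), PRS approach,
# as quoted in the abstract. BCC is on the liability scale.
r_squared = {
    "HC": (0.0456, 0.0755),
    "TA": (0.0154, 0.0344),
    "BCC": (0.0138, 0.0232),
}

# Round to whole percentages, matching the precision used in the abstract.
increases = {
    trait: round(relative_increase(before, after))
    for trait, (before, after) in r_squared.items()
}

for trait, pct in increases.items():
    print(f"{trait}: +{pct}%")
```

The rounded results match the 66%, 123%, and 68% figures reported for HC, TA, and BCC.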

IUPUIScholarWorks

PubMed Central


Fast and accurate long-range phasing in a UK Biobank cohort

Authors: Loh Po-Ru; Palamara Pier Francesco; Price Alkes L
Publication venue: Springer Science and Business Media LLC
Publication date: 03/01/2017

Recent work has leveraged the extensive genotyping of the Icelandic population to perform long-range phasing (LRP), enabling accurate imputation and association analysis of rare variants in target samples typed on genotyping arrays. Here, we develop a fast and accurate LRP method, Eagle, that extends this paradigm to populations with much smaller proportions of genotyped samples by harnessing long (>4 cM) identical-by-descent (IBD) tracts shared among distantly related individuals. We applied Eagle to N≈150,000 samples (0.2% of the British population) from the UK Biobank, and we determined that it is 1–2 orders of magnitude faster than existing methods while achieving similar or better phasing accuracy (switch error rate ≈0.3%, corresponding to perfect phase in a majority of 10 Mb segments). We also observed that when used within an imputation pipeline, Eagle pre-phasing improved downstream imputation accuracy compared to pre-phasing in batches using existing methods (as necessary to achieve comparable computational cost).

Harvard University - DASH

Progress and promise in understanding the genetic basis of common diseases

Authors: Donnelly Peter; Price Alkes L.; Spencer Chris C. A.
Publication venue: The Royal Society
Publication date: 01/01/2015

Susceptibility to common human diseases is influenced by both genetic and environmental factors. The explosive growth of genetic data, and the knowledge that it is generating, are transforming our biological understanding of these diseases. In this review, we describe the technological and analytical advances that have enabled genome-wide association studies to be successful in identifying a large number of genetic variants robustly associated with common disease. We examine the biological insights that these genetic associations are beginning to produce, from functional mechanisms involving individual genes to biological pathways linking associated genes, and the identification of functional annotations, some of which are cell-type-specific, enriched in disease associations. Although most efforts have focused on identifying and interpreting genetic variants that are irrefutably associated with disease, it is increasingly clear that, even at large sample sizes, these represent only the tip of the iceberg of genetic signal, motivating polygenic analyses that consider the effects of genetic variants throughout the genome, including modest effects that are not individually statistically significant. As data from an increasingly large number of diseases and traits are analysed, pleiotropic effects (defined as genetic loci affecting multiple phenotypes) can help integrate our biological understanding. Looking forward, the next generation of population-scale data resources, linking genomic information with health outcomes, will lead to another step-change in our ability to understand, and treat, common diseases.

Harvard University - DASH

Oxford University Research Archive